Why We Read Wikipedia

Author: DeMaio T. J.
Gelman A.
Goel S.
Harkness J. A.
Jurgens D.
Kish L.
Klösgen W.
Krug S.
Lee B. K.
Mukhopadhyay P.
Salganik M. J.
Strauss A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017

Wikipedia is one of the most popular sites on the Web, with millions of users relying on it to satisfy a broad range of information needs every day. Although it is crucial to understand what exactly these needs are in order to be able to meet them, little is currently known about why users visit Wikipedia. The goal of this paper is to fill this gap by combining a survey of Wikipedia readers with a log-based analysis of user activity. Based on an initial series of user surveys, we build a taxonomy of Wikipedia use cases along several dimensions, capturing users' motivations to visit Wikipedia, the depth of knowledge they are seeking, and their knowledge of the topic of interest prior to visiting Wikipedia. Then, we quantify the prevalence of these use cases via a large-scale user survey conducted on live Wikipedia with almost 30,000 responses. Our analyses highlight the variety of factors driving users to Wikipedia, such as current events, media coverage of a topic, personal curiosity, work or school assignments, or boredom. Finally, we match survey responses to the respondents' digital traces in Wikipedia's server logs, enabling the discovery of behavioral patterns associated with specific use cases. For instance, we observe long and fast-paced page sequences across topics for users who are bored or exploring randomly, whereas those using Wikipedia for work or school spend more time on individual articles focused on topics such as science. Our findings advance our understanding of reader motivations and behavior on Wikipedia and can have implications for developers aiming to improve Wikipedia's user experience, editors striving to cater to their readers' needs, third-party services (such as search engines) providing access to Wikipedia content, and researchers aiming to build tools such as recommendation engines.

Comment: Published in WWW'17; v2 fixes caption of Table

arXiv.org e-Print Archive

Crossref

MAnnheim DOCument Server

Publikationsserver der RWTH Aachen University

Detection of lipoarabinomannan (LAM) in urine is an independent predictor of mortality risk in patients receiving treatment for HIV-associated tuberculosis in sub-Saharan Africa: a systematic review and meta-analysis

Author: Flach Clare
Gupta-Wright Ankur
Lawn Stephen D
Peters Jurgens A
Publication venue: Desmond Tutu HIV Centre
Publication date: 01/01/2016

Background: Simple immune capture assays that detect mycobacterial lipoarabinomannan (LAM) antigen in urine are promising new tools for the diagnosis of HIV-associated tuberculosis (HIV-TB). In addition, however, recent prospective cohort studies of patients with HIV-TB have demonstrated associations between LAM in the urine and increased mortality risk during TB treatment, indicating an additional utility of urinary LAM as a prognostic marker. We conducted a systematic review and meta-analysis to summarise the evidence concerning the strength of this relationship in adults with HIV-TB in sub-Saharan Africa, thereby quantifying the assay’s prognostic value.

Methods: We searched MEDLINE and Embase databases using comprehensive search terms for ‘HIV’, ‘TB’, ‘LAM’ and ‘sub-Saharan Africa’. Identified studies were reviewed and selected according to predefined criteria.

Results: We identified 10 studies eligible for inclusion in this systematic review, reporting on a total of 1172 HIV-TB cases. Of these, 512 patients (44%) tested positive for urinary LAM. After a variable duration of follow-up of between 2 and 6 months, overall case fatality rates among HIV-TB cases varied between 7% and 53%. Pooled summary estimates generated by random-effects meta-analysis showed a two-fold increased risk of mortality for urinary LAM-positive HIV-TB cases compared to urinary LAM-negative HIV-TB cases (relative risk 2.3, 95% confidence interval 1.6–3.1). Some heterogeneity was explained by study setting and patient population in sub-group analyses. Five studies also reported multivariable analyses of risk factors for mortality, and pooled summary estimates demonstrated over two-fold increased mortality risk (odds ratio 2.5, 95% confidence interval 1.4–4.5) among urinary LAM-positive HIV-TB cases, even after adjustment for other risk factors for mortality, including CD4 cell count.

Conclusions: We have demonstrated that detectable LAM in urine is associated with increased risk of mortality during TB treatment, and that this relationship remains after adjusting for other risk factors for mortality. This may simply be due to a positive test for urinary LAM serving as a marker of higher mycobacterial load and greater disease dissemination and severity. Alternatively, LAM antigen may directly compromise host immune responses through its known immunomodulatory effects. Detectable LAM in the urine is an independent risk factor for mortality among patients receiving treatment for HIV-TB. Further research is warranted to elucidate the underlying mechanisms and to determine whether this vulnerable patient population may benefit from adjunctive interventions.

Electronic supplementary material: The online version of this article (doi:10.1186/s12916-016-0603-9) contains supplementary material, which is available to authorized users.

Cape Town University OpenUCT

LSHTM Research Online

Springer - Publisher Connector

PubMed Central

King's Research Portal

Asynchronous Training of Word Embeddings for Large Text Corpora

Author: Almuhareb A.
Boucher T.
Garten J.
Ghannay S.
Goikoetxea J.
Jurgens D. A.
Levy O.
Li Y.
Luong M.-T.
Mikolov T.
Recht B.
Socher R.
Stergiou S.
Vuurens J. B. P.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/12/2018

Word embeddings are a powerful approach for analyzing language and have been widely popular in numerous tasks in information retrieval and text mining. Training embeddings over huge corpora is computationally expensive because the input is typically sequentially processed and parameters are synchronously updated. Distributed architectures for asynchronous training that have been proposed either focus on scaling vocabulary sizes and dimensionality or suffer from expensive synchronization latencies. In this paper, we propose a scalable approach to train word embeddings by partitioning the input space instead in order to scale to massive text corpora while not sacrificing the performance of the embeddings. Our training procedure does not involve any parameter synchronization except a final sub-model merge phase that typically executes in a few minutes. Our distributed training scales seamlessly to large corpus sizes and we get comparable and sometimes even up to 45% performance improvement in a variety of NLP benchmarks using models trained by our distributed procedure which requires 1/10 of the time taken by the baseline approach. Finally we also show that we are robust to missing words in sub-models and are able to effectively reconstruct word representations.

Comment: This paper contains 9 pages and has been accepted in the WSDM201

arXiv.org e-Print Archive

Crossref

Wavelet Based Fractal Analysis of Airborne Pollen

Author: A. Arneodo
A. Grossman
C. Goldberg
C. M. Arizmendi
H. Jurgens
J. D. Farmer
J. F. Muzy
L. Moseholm
M. E. Degaudenzi
M. Käpylä
M. M. Bianchi
M. Nicollet
M. O’Rourke
M. Schröeder
N. B. Abraham
P. Comtois
P. Goupillaud
P. Grassberger
R. Leuschner
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/1998

The most abundant biological particles in the atmosphere are pollen grains and spores. People with pollen allergies can protect themselves when information about future pollen levels in the air is available. Despite the importance of forecasting airborne pollen concentrations, it has not been possible to predict them with great accuracy, and about 25% of daily pollen forecasts have failed. Previous analyses of the dynamic characteristics of atmospheric pollen time series indicate that the system can be described by a low-dimensional chaotic map. We apply the wavelet transform to study the multifractal characteristics of an airborne pollen time series. We find persistent behaviour associated with low pollen concentration values and with the rarest events of highest pollen concentration. The information and correlation dimensions correspond to a chaotic system showing loss of information with time evolution.

Comment: 11 pages, 7 figure

arXiv.org e-Print Archive

CiteSeerX

Crossref

A survey of location inference techniques on Twitter

Author: Bouillot F
Chang H
Eisenstein J
Jurgens D
Mahmud J
Paradesi SM
Paul MJ
Pennacchiotti M
Ritter A
Rui L
Schulz A
Publication venue: 'SAGE Publications'
Publication date: 01/01/2015

The increasing popularity of the social networking service, Twitter, has made it more involved in day-to-day communications, strengthening social relationships and information dissemination. Conversations on Twitter are now being explored as indicators within early warning systems to alert of imminent natural disasters such as earthquakes and aid prompt emergency responses to crime. Producers are privileged to have limitless access to market perception from consumer comments on social media and microblogs. Targeted advertising can be made more effective based on user profile information such as demography, interests and location. While these applications have proven beneficial, the ability to effectively infer the location of Twitter users has even more immense value. However, accurately identifying where a message originated from or an author’s location remains a challenge, thus essentially driving research in that regard. In this paper, we survey a range of techniques applied to infer the location of Twitter users from inception to state of the art. We find significant improvements over time in the granularity levels and better accuracy with results driven by refinements to algorithms and inclusion of more spatial features

arXiv.org e-Print Archive

Queen's University Belfast Research Portal

Crossref

E-space: Manchester Metropolitan University's Research Repository

Sheffield Hallam University Research Archive

UWE Bristol Research Repository

Explore Bristol Research

Domain-independent Extraction of Scientific Concepts from Research Articles

Author: A Constantin
D Jurgens
J Beel
J Cohen
J Lehmann
K Balog
M Liakata
N Lao
O Bodenreider
S Hochreiter
V Pertsas
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

We examine the novel task of domain-independent scientific concept extraction from abstracts of scholarly articles and present two contributions. First, we suggest a set of generic scientific concepts that have been identified in a systematic annotation process. This set of concepts is utilised to annotate a corpus of scientific abstracts from 10 domains of Science, Technology and Medicine at the phrasal level in a joint effort with domain experts. The resulting dataset is used in a set of benchmark experiments to (a) provide baseline performance for this task, (b) examine the transferability of concepts between domains. Second, we present two deep learning systems as baselines. In particular, we propose active learning to deal with different domains in our task. The experimental results show that (1) a substantial agreement is achievable by non-experts after consultation with domain experts, (2) the baseline system achieves a fairly high F1 score, (3) active learning enables us to nearly halve the amount of required training data.

Comment: Accepted for publishing in 42nd European Conference on IR Research, ECIR 202

arXiv.org e-Print Archive

Crossref

Repositorium für Naturwissenschaften und Technik

Mercury's Moment of Inertia from Spin and Gravity Data

Author: Campbell Donald B.
Ghigo Frank D.
Giorgini Jon D.
Hauck II Steven A.
Jurgens Raymond F.
Margot Jean-Luc
Padovan Sebastiano
Peale Stanton J.
Solomon Sean C.
Yseboodt Marie
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2012

Earth-based radar observations of the spin state of Mercury at 35 epochs between 2002 and 2012 reveal that its spin axis is tilted by (2.04 ± 0.08) arc min with respect to the orbit normal. The direction of the tilt suggests that Mercury is in or near a Cassini state. Observed rotation rate variations clearly exhibit an 88-day libration pattern which is due to solar gravitational torques acting on the asymmetrically shaped planet. The amplitude of the forced libration, (38.5 ± 1.6) arc sec, corresponds to a longitudinal displacement of ∼450 m at the equator. Combining these measurements of the spin properties with second-degree gravitational harmonics (Smith et al., 2012) provides an estimate of the polar moment of inertia of Mercury, C/MR² = 0.346 ± 0.014, where M and R are Mercury's mass and radius. The fraction of the moment that corresponds to the outer librating shell, which can be used to estimate the size of the core, is Cm/C = 0.431 ± 0.025.
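
The arithmetic behind two of the figures in this abstract can be checked with a short sketch. It converts the quoted libration amplitude into an arc length at the equator and multiplies the two quoted ratios to get the normalized moment of the outer shell. The radius value and the function name are assumptions introduced for illustration (the abstract does not state R), and the uncertainties are ignored.

```python
import math

# Number of arcseconds in one radian (about 206265).
ARCSEC_PER_RADIAN = 180 * 3600 / math.pi

# Assumption: mean radius of Mercury in km; not given in the abstract.
MERCURY_RADIUS_KM = 2439.7


def equatorial_displacement_m(amplitude_arcsec, radius_km):
    """Arc length in metres swept at the equator by a small rotation angle."""
    angle_rad = amplitude_arcsec / ARCSEC_PER_RADIAN
    return angle_rad * radius_km * 1000.0


# Libration amplitude quoted in the abstract: 38.5 arc sec.
displacement_m = equatorial_displacement_m(38.5, MERCURY_RADIUS_KM)

# (Cm/C) * (C/MR^2) gives Cm/MR^2, the normalized moment of the outer shell.
shell_moment = 0.431 * 0.346

print(f"equatorial displacement: {displacement_m:.0f} m")
print(f"Cm/MR^2: {shell_moment:.3f}")
```

With the assumed radius, the displacement comes out within a few metres of the ∼450 m quoted above.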

Columbia University Academic Commons

eScholarship - University of California

Rapid urine-based screening for tuberculosis in HIV-positive patients admitted to hospital in Africa (STAMP): a pragmatic, multicentre, parallel-group, double-blind, randomised controlled trial.

Author: Alufandika-Moyo Melanie
Chiume Lingstone
Corbett Elizabeth
Fielding Katherine
Flach Clare
Grint Daniel
Gupta-Wright Ankur
Lawn Stephen D
Peters Jurgens A
van Oosterhout Joep J
Wilson Douglas
Publication venue: Elsevier
Publication date: 19/07/2018

BACKGROUND Current diagnostics for HIV-associated tuberculosis are suboptimal, with missed diagnoses contributing to high hospital mortality and approximately 374 000 annual HIV-positive deaths globally. Urine-based assays have a good diagnostic yield; therefore, we aimed to assess whether urine-based screening in HIV-positive inpatients for tuberculosis improved outcomes. METHODS We did a pragmatic, multicentre, double-blind, randomised controlled trial in two hospitals in Malawi and South Africa. We included HIV-positive medical inpatients aged 18 years or more who were not taking tuberculosis treatment. We randomly assigned patients (1:1), using a computer-generated list of random block size stratified by site, to either the standard-of-care or the intervention screening group, irrespective of symptoms or clinical presentation. Attending clinicians made decisions about care; and patients, clinicians, and the study team were masked to the group allocation. In both groups, sputum was tested using the Xpert MTB/RIF assay (Xpert; Cepheid, Sunnyvale, CA, USA). In the standard-of-care group, urine samples were not tested for tuberculosis. In the intervention group, urine was tested with the Alere Determine TB-LAM Ag (TB-LAM; Alere, Waltham, MA, USA), and Xpert assays. The primary outcome was all-cause 56-day mortality. Subgroup analyses for the primary outcome were prespecified based on baseline CD4 count, haemoglobin, clinical suspicion for tuberculosis; and by study site and calendar time. We used an intention-to-treat principle for our analyses. This trial is registered with the ISRCTN registry, number ISRCTN71603869. FINDINGS Between Oct 26, 2015, and Sept 19, 2017, we screened 4788 HIV-positive adults, of which 2600 (54%) were randomly assigned to the study groups (n=1300 for each group). 13 patients were excluded after randomisation from analysis in each group, leaving 2574 in the final intention-to-treat analysis (n=1287 in each group). 
At admission, 1861 patients were taking antiretroviral therapy and median CD4 count was 227 cells per μL (IQR 79-436). Mortality at 56 days was reported for 272 (21%) of 1287 patients in the standard-of-care group and 235 (18%) of 1287 in the intervention group (adjusted risk reduction [aRD] -2·8%, 95% CI -5·8 to 0·3; p=0·074). In three of the 12 prespecified but underpowered subgroups, mortality was lower in the intervention group than in the standard-of-care group: CD4 counts less than 100 cells per μL (aRD -7·1%, 95% CI -13·7 to -0·4; p=0·036), severe anaemia (-9·0%, -16·6 to -1·3; p=0·021), and patients with clinically suspected tuberculosis (-5·7%, -10·9 to -0·5; p=0·033); with no difference by site or calendar period. Adverse events were similar in both groups. INTERPRETATION Urine-based tuberculosis screening did not reduce overall mortality in all HIV-positive inpatients, but might benefit some high-risk subgroups. Implementation could contribute towards global targets to reduce tuberculosis mortality. FUNDING Joint Global Health Trials Scheme of the Medical Research Council, the UK Department for International Development, and the Wellcome Trust.
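
As a rough check on the headline mortality figures, the unadjusted risk difference can be recomputed from the reported counts. This is only a sketch: the abstract reports an adjusted estimate, whereas the code below uses a plain difference of proportions with a Wald 95% interval, which is an assumption chosen here for illustration and not the trial's analysis method. The function name is hypothetical.

```python
import math


def risk_difference(events_a, n_a, events_b, n_b, z=1.96):
    """Unadjusted risk difference (a minus b) with a Wald confidence interval."""
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    diff = risk_a - risk_b
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(risk_a * (1 - risk_a) / n_a + risk_b * (1 - risk_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Counts from the abstract: 235/1287 deaths (intervention),
# 272/1287 deaths (standard of care).
diff, lower, upper = risk_difference(235, 1287, 272, 1287)

print(f"risk difference: {diff * 100:.1f}%")
print(f"95% CI: {lower * 100:.1f}% to {upper * 100:.1f}%")
```

The unadjusted interval spans zero, in line with the abstract's statement that overall mortality was not significantly reduced; small differences from the reported adjusted values are expected.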

LSTM Online Archive

LSHTM Research Online

King's Research Portal

Electrons in High-Tc Compounds: Ab-Initio Correlation Results

Electronic correlations in the ground state of an idealized infinite-layer high-Tc compound are computed using the ab-initio method of local ansatz. Comparisons are made with the local-density approximation (LDA) results, and the correlation functions are analyzed in detail. These correlation functions are used to determine the effective atomic-interaction parameters for model Hamiltonians. On the resulting model, doping dependencies of the relevant correlations are investigated. Aside from the expected strong atomic correlations, particular spin correlations arise. The dominating contribution is a strong nearest neighbor correlation that is Stoner-enhanced due to the closeness of the ground state to the magnetic phase. This feature depends moderately on doping, and is absent in a single-band Hubbard model. Our calculated spin correlation function is in good qualitative agreement with that determined from the neutron scattering experiments for a metal.

Comment: 21pp, 5fig, Phys. Rev. B (Oct. 98

arXiv.org e-Print Archive

Crossref

Communication calls produced by electrical stimulation of four structures in the guinea pig brain

One of the main central processes affecting the cortical representation of conspecific vocalizations is the collateral output from the extended motor system for call generation. Before starting to study this interaction we sought to compare the characteristics of calls produced by stimulating four different parts of the brain in guinea pigs (Cavia porcellus). By using anaesthetised animals we were able to reposition electrodes without distressing the animals. Trains of 100 electrical pulses were used to stimulate the midbrain periaqueductal grey (PAG), hypothalamus, amygdala, and anterior cingulate cortex (ACC). Each structure produced a similar range of calls, but in significantly different proportions. Two of the spontaneous calls (chirrup and purr) were never produced by electrical stimulation and although we identified versions of chutter, durr and tooth chatter, they differed significantly from our natural call templates. However, we were routinely able to elicit seven other identifiable calls. All seven calls were produced both during the 1.6 s period of stimulation and subsequently in a period which could last for more than a minute. A single stimulation site could produce four or five different calls, but the amygdala was much less likely to produce a scream, whistle or rising whistle than any of the other structures. These three high-frequency calls were more likely to be produced by females than males. There were also differences in the timing of the call production with the amygdala primarily producing calls during the electrical stimulation and the hypothalamus mainly producing calls after the electrical stimulation. For all four structures a significantly higher stimulation current was required in males than females. We conclude that all four structures can be stimulated to produce fictive vocalizations that should be useful in studying the relationship between the vocal motor system and cortical sensory representation

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

Directory of Open Access Journals

FigShare